Dialogue SWE-Bench

mentions 1 type Person

// recent coverage 1 mentions

2026-06-15 04:00

arxiv.org

ai-agents

Dialogue SWE-Bench: A Benchmark for Dialogue-Driven Coding Agents

Researchers introduced Dialogue SWE-Bench, a benchmark for evaluating AI coding agents through dialogue with users, revealing that coding proficiency does not guarantee strong dialogue skills. The stu…

// co-occurs with top 1 entities

arXiv 1